1. AI, DATA AND BIG DATA: OWNERSHIP AND PROTECTION
DR ANDRES GUADAMUZ, UNIVERSITY OF SUSSEX
8. GROWING ECONOMIC IMPORTANCE OF DATA
• New business models are
practically predicated on the
existence, gathering and
processing of data.
• Contacts, social media, web
advertising: all are predicated
on it.
• Access to scientific data
may prove to be the
difference between life and
death.
9. BIG DATA
• Big data is a broad term for
data sets so large or complex
that traditional data processing
applications are inadequate.
Challenges include analysis,
capture, data curation, search,
sharing, storage, transfer,
visualization, and information
privacy.
• From a legal perspective, there is
not a big difference between small
data and big data for the most
part.
10. AI AND DATA
• The main interface between
data and AI is the use of
machines to gather and
process data automatically.
• Data mining processes are
used to analyse data and
produce new knowledge.
• IP issues related to AI could
be made more complicated
by databases.
12. HARGREAVES REVIEW OF INTELLECTUAL PROPERTY
“Text mining is one current example of a new technology which
copyright should not inhibit, but does. It appears that the current
non-commercial research ‘Fair Dealing’ exception in UK law will
not cover use of these tools under the current interpretation of
‘Fair Dealing’. In any event text mining of databases is often
excluded by the contract for accessing the database. The
Government should introduce a UK exception in the interim
under the non-commercial research heading to allow use of
analytics for non-commercial use, as in the malaria example
above, as well as promoting at EU level an exception to support
text mining and data analytics for commercial use.”
13. FINCH REPORT
“Related to such moves has been a growth of interest in exploiting
the potential of text-mining tools to analyse and process the
information contained in collections or corpora of journal articles
and other documents in order to extract relevant information, to
manipulate it, and to generate new information. The use of such
techniques is not yet widespread, not least because arrangements
for making publications available for text mining can be complex,
and because the entry costs are high for those who lack the
necessary technical skills. But text mining offers considerable
potential to increase the efficiency, effectiveness and quality of
research, to unlock hidden information, and to develop new
knowledge.”
14. UK 2014 EXCEPTION
29A Copies for text and data analysis for non-commercial research
(1) The making of a copy of a work by a person who has lawful access to the work
does not infringe copyright in the work provided that—
(a) the copy is made in order that a person who has lawful access to the work may carry
out a computational analysis of anything recorded in the work for the sole purpose of
research for a non-commercial purpose, and
(b) the copy is accompanied by a sufficient acknowledgement (unless this would be
impossible for reasons of practicality or otherwise).
(2) Where a copy of a work has been made under this section, copyright in the work is
infringed if—
(a) the copy is transferred to any other person, except where the transfer is authorised
by the copyright owner, or
(b) the copy is used for any purpose other than that mentioned in subsection (1)(a),
except where the use is authorised by the copyright owner.
15. JAPANESE EXCEPTION
• “For the purpose of information analysis (‘information
analysis’ means to extract information, concerned with
languages, sounds, images or other elements constituting
such information, from many works or other such
information, and to make a comparison, a classification or
other statistical analysis of such information; the same
shall apply hereinafter in this Article) by using a computer,
it shall be permissible to make recording on a memory, or
to make adaptation (including a recording of a derivative
work created by such adaptation), of a work, to the extent
deemed necessary(…)”
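As a rough illustration of what "information analysis" by computer can look like in practice, the minimal Python sketch below extracts word counts from several short texts and then compares them. The corpus, the function names and the tokenisation rule are all hypothetical and chosen only for illustration; nothing here is drawn from the statute or from any particular text-mining tool.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for "many works"; titles and texts are invented.
CORPUS = {
    "article_a": "Malaria parasites resist the drug. The drug fails.",
    "article_b": "A new drug targets malaria parasites in mosquitoes.",
}

def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def term_frequencies(corpus):
    """Extract information: count how often each word occurs in each work."""
    return {title: Counter(tokenize(text)) for title, text in corpus.items()}

def shared_terms(frequencies):
    """Compare the works: return the words that appear in every one of them."""
    vocabularies = [set(counts) for counts in frequencies.values()]
    return set.intersection(*vocabularies)

frequencies = term_frequencies(CORPUS)
common = shared_terms(frequencies)
```

Even this toy analysis has to read each text in full, and a real pipeline would normally hold a stored copy of each work, which is the act the exception is framed around.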
16. PROPOSED DIRECTIVE ON COPYRIGHT IN THE DIGITAL SINGLE MARKET COM(2016) 593 FINAL
• Art 3. “Member States shall provide for an exception
to the rights provided for in Article 2 of Directive
2001/29/EC, Articles 5(a) and 7(1) of Directive
96/9/EC and Article 11(1) of this Directive for
reproductions and extractions made by research
organisations in order to carry out text and data
mining of works or other subject-matter to which
they have lawful access for the purposes of
scientific research.”
17. COMODINI AMENDMENTS (2017)
• Art 3.1. Member States shall provide for an exception
to the rights provided for in Article 2 of Directive
2001/29/EC, Articles 5(a) and 7(1) of Directive
96/9/EC and Article 11(1) of this Directive for
reproductions and extractions to be made by a person
who has lawful access to works and other subject
matter provided that reproduction or extraction is used
for the sole purpose of text and data mining.
18. SOME CONSIDERATIONS
• Is an exception needed?
• Temporary copies not
covered as infringement,
and most TDM could fall
under this.
• Cumulative copying may not
amount to infringement if there is
no substantial copying.
• No causal connection?
19. CRITICISMS
• Differs in important respects from the UK fair dealing
exception.
• COMMUNIA: Exception will create a privileged class of text
and data miners. Excludes journalists and advocacy
groups.
• Large datasets behind paywalls.
• “Europe needs an exception that allows text and data
mining of lawfully accessible materials by anyone for any
purpose.”
21. DATABASE
• A database is a collection
of information that is
organized so that it can
easily be accessed,
managed, and updated. In
one view, databases can
be classified according to
types of content:
bibliographic, full-text,
numeric, and images.
22. DATA
• Any type of information
contained in the database.
• This can range from works
subject to copyright
protection, to non-protected
elements such as individual
facts and figures.
• Important to determine the
type of database to determine
possible protection.
23. DATABASE ELEMENTS
• Structure: The
organisational ordering of
the database.
• Functional elements:
Headers, search queries,
tables, etc.
• Content.
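The three elements can be told apart in even a very small database. The sketch below uses Python's built-in sqlite3 module; the table name, column names and fixture values are invented, and the mapping of code to legal categories is only a loose illustration, not a statement of how a court would classify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structure: the organisational ordering, expressed here as a table schema.
conn.execute(
    "CREATE TABLE fixtures ("
    "id INTEGER PRIMARY KEY, home TEXT, away TEXT, match_date TEXT)"
)

# Content: the individual records held in the database (invented values).
conn.executemany(
    "INSERT INTO fixtures (home, away, match_date) VALUES (?, ?, ?)",
    [
        ("Team A", "Team B", "2024-01-06"),
        ("Team C", "Team A", "2024-01-13"),
    ],
)

# Functional element: a search query used to retrieve records.
def fixtures_for(team):
    rows = conn.execute(
        "SELECT home, away, match_date FROM fixtures "
        "WHERE home = ? OR away = ? ORDER BY match_date",
        (team, team),
    )
    return rows.fetchall()

team_a = fixtures_for("Team A")
```

The schema and the query can be rewritten without touching the rows, and the rows can be replaced without touching the schema, which is one way to see why the elements are assessed separately.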
26. THERE IS NO LEGAL PROTECTION OF DATA (AS SUCH)
27. COPYRIGHT PROTECTION OF DATA
• Individual data elements could
be protected under copyright.
• Simple collection of data does
not warrant copyright
protection.
• Articles, pictures, literature,
music, etc.
• Metadata could be covered as
digital rights management in
the shape of Rights
Management Information.
28. TRADE SECRETS
• Directive (EU) 2016/943 harmonises
practices around Europe on trade secrets.
• A ‘trade secret’ is information that meets all of the following requirements:
• it is secret in the sense that it is not, as a
body or in the precise configuration and
assembly of its components, generally
known among or readily accessible to
persons within the circles that normally deal
with the kind of information in question;
• it has commercial value because it is
secret;
• it has been subject to reasonable steps
under the circumstances, by the person
lawfully in control of the information, to
keep it secret;
29. TRADE SECRETS
• Sets out circumstances for
lawful acquisition, including
independent discovery, and
reverse engineering.
• Unlawful acquisition:
unauthorised access, copying
of the information, or
dishonest commercial
practices.
• Exceptions, proportionality
principle, remedies.
31. GDPR
• Contrary to common belief, the GDPR doesn’t create a new data
ownership right.
• The GDPR applies to ‘controllers’ and ‘processors’.
• A controller determines the purposes and means of processing
personal data.
• A processor is responsible for processing personal data on
behalf of a controller.
• Controllers and processors have certain obligations that could be
relevant to AI.
32. DATA MINIMISATION
• Controllers must ensure the
personal data is:
• adequate – sufficient to
properly fulfil the stated
purpose;
• relevant – has a rational link
to that purpose; and
• limited to what is necessary –
they do not hold more than
they need for that purpose.
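One way an engineer might approach the three tests is to tie each stated purpose to an explicit list of fields and discard everything else before storage. The Python sketch below is a minimal illustration under that assumption: the purposes, field names and the `minimise` helper are hypothetical, and passing such a check would not by itself show compliance with the GDPR.

```python
# Hypothetical mapping from a stated purpose to the fields that purpose needs.
FIELDS_NEEDED = {
    "newsletter": {"email"},
    "delivery": {"name", "postal_address"},
}

def minimise(record, purpose):
    """Keep only the fields needed for the purpose; fail if a needed field is missing."""
    needed = FIELDS_NEEDED[purpose]
    missing = needed - set(record)
    if missing:
        # Not adequate: the data would be insufficient for the stated purpose.
        raise ValueError(
            f"record lacks fields needed for {purpose}: {sorted(missing)}"
        )
    # Limited to what is necessary: every other field is dropped.
    return {field: record[field] for field in needed}

# Invented example record.
customer = {
    "name": "A. Example",
    "email": "a@example.org",
    "postal_address": "1 Example Street",
    "date_of_birth": "1990-01-01",
}
newsletter_record = minimise(customer, "newsletter")
```

Here the date of birth never reaches the newsletter record because no listed purpose needs it. Whether a given field has a rational link to a purpose remains a judgment that the mapping only records.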
33. TECHNICAL ISSUES
• DP by design: appropriate
technical and organisational
measures to implement the
data protection principles and
safeguard individual rights.
• Security of data: Controllers
must ensure that they have
appropriate security measures
in place to protect the
personal data they hold. Data
must have integrity.
36. DATABASES
• Databases are functional
• Content within the database
is often subject to copyright
protection.
• They are often implemented
through software (protected
by copyright).
• But the issue is not one of
copying.
37. COPYRIGHT
• Berne Convention Art. 2(5):
Collections of literary or artistic
works such as encyclopaedias and
anthologies which, by reason of the
selection and arrangement of their
contents, constitute intellectual
creations shall be protected as
such, without prejudice to the
copyright in each of the works
forming part of such collections.
• This does not cover the whole
range of databases, particularly
non-creative ones.
38. TRIPS
• Art 10(2): Compilations of
data or other material, whether
in machine readable or other
form, which by reason of the
selection or arrangement of
their contents constitute
intellectual creations shall be
protected as such. Such
protection, which shall not
extend to the data or material
itself, shall be without prejudice
to any copyright subsisting in
the data or material itself.
39. WIPO COPYRIGHT TREATIES
• Art 5. Compilations of data or other material, in any form,
which by reason of the selection or arrangement of their
contents constitute intellectual creations, are protected as
such. This protection does not extend to the data or the
material itself and is without prejudice to any copyright
subsisting in the data or material contained in the
compilation.
• “Agreed statement”: The scope of protection for
compilations of data (databases) under Article 5 of this
Treaty, read with Article 2, is consistent with Article 2
of the Berne Convention and on a par with the relevant
provisions of the TRIPS Agreement.
40. FEIST V RURAL TELEPHONE 499 U.S. 340 (1991)
• Prior to the case, some US courts applied the “sweat of the brow” doctrine.
• Facts were not copyrightable, but compilations of facts could be.
• “There is an undeniable tension between these two propositions. Many
compilations consist of nothing but raw data -- i.e. wholly factual
information not accompanied by any original expression. On what
basis may one claim a copyright upon such work? Common sense
tells us that 100 uncopyrightable facts do not magically change their
status when gathered together in one place. … The key to resolving
the tension lies in understanding why facts are not copyrightable: The
sine qua non of copyright is originality.”
• Databases in the shape of compilations of facts are not subject to copyright
protection unless their selection or arrangement is original.
41. DATABASES AND ORIGINALITY
• The result of Feist is that in
many jurisdictions
databases are not protected
under copyright law.
• Meeting the threshold of
originality for a collection of
facts becomes difficult, if not
impossible.
• But databases are very
valuable assets!
42. UK LAW
• S 3A. (1) In this Part “database” means a collection of
independent works, data or other materials which—
• (a) are arranged in a systematic or methodical way, and
• (b) are individually accessible by electronic or other means.
• (2) For the purposes of this Part a literary work
consisting of a database is original if, and only if, by
reason of the selection or arrangement of the contents
of the database the database constitutes the author’s
own intellectual creation.
43. SO…
• This means that in UK
copyright law the author’s
own “intellectual creation” is
required in the selection and
arrangement of the contents
of a database.
• A mere gathering of data
that does not meet this
requirement is not protected,
because it fails
the originality test.
44. CASE LAW TESTING “OWN INTELLECTUAL CREATION”
• Bezpečnostní softwarová asociace v Ministerstvo kultury
C-393/09. Originality in GUI. Functional elements are not
original enough
• Navitaire Inc v Easyjet Airline Co. & Anor [2004] EWHC
1725 (Ch). Functional elements in software not
original.
• Infopaq International A/S v Danske Dagblades Forening C-
5/08. Selection may prove author’s own intellectual
creation.
45. FOOTBALL DATACO LTD AND OTHERS V YAHOO! UK LTD AND OTHERS C-604/10
• The case involved the fixture lists of
football matches in the English and
Scottish leagues, which are
produced by a company called
Football DataCo.
• Yahoo! copied these fixtures without
paying licence fees, so Football
DataCo sued them alleging that by
doing so Yahoo! had infringed both
copyright and its database rights.
• Copyright can only be afforded to a
database if its structure is the
maker’s own intellectual creation.
46. FOOTBALL DATACO
• “…the significant labour and
skill required for setting up that
database cannot as such
justify such a protection if they
do not express any originality
in the selection or arrangement
of the data which that
database contains.”
• High threshold for originality in
databases, particularly
automated ones.
47. DATABASE RIGHT
• Enacted to enhance
European competitive
advantage and encourage
the creation of more
databases.
• No evidence for this claim.
• James Boyle calls it “faith-
based policymaking”.
48. DIRECTIVE 96/9/EC ON THE PROTECTION OF DATABASES
• Exists regardless of the existence of copyright protection in the
database, as the exclusive rights given to the database owner
are separate to those arising from copyright.
• Exclusive right given to the maker of a database
• Database is a collection of independent works, data or other
materials that are arranged in a systematic or methodical way,
and are individually accessible by electronic or other means.
• The right exists if “there has been a substantial investment in
obtaining, verifying or presenting the contents of the database”.
49. DATABASE DIRECTIVE (CONT)
• The right subsists for 15 years from the completion of
the database.
• The right is infringed if a person without authorisation
extracts or re-utilises all or a substantial part of the
contents of the database.
• The right is also infringed by the repeated and systematic
extraction or re-utilisation of insubstantial parts of the
database.
50. DATABASE RIGHT
• The right is not a real intellectual property right (no qualitative
condition, but an economic criterion).
• The database producer must prove that it has made a "substantial"
investment.
• It is a protection for the maker/producer of the database (and not
for intellectual authors).
• It is not a right to control each item of data but only a protection against
unauthorized extraction and/or re-utilization of the whole or of a
substantial part, evaluated qualitatively and/or quantitatively, of
the contents of that database.
51. SO WE HAVE A TWO-PRONGED PROTECTION
• Copyright protection of an original "structure" of a database
- Only "creative" data (pictures, text, music) are protected by copyright.
Each copyrighted item is protected against non-private copying and re-use.
- "Factual" data (statistics, raw scientific results) are not protected by
copyright.
• New "sui generis" right protecting the investment of the
producer against misappropriation of the database
contents
- but some reproductions of a set of factual data could be prohibited
if they are considered a reproduction of the database "in part"
52. BRITISH HORSERACING BOARD V WILLIAM HILL C-203/02
• British Horseracing Board (BHB) is the
governing authority for horse-racing in the UK. It
maintains a database containing extensive pre-race
information.
• William Hill (WH), a leading bookmaker, repeatedly
obtained information indirectly from BHB’s database
(via third parties licensed by BHB) and used it on its
website.
• BHB sued for infringement of database right.
53. BHB V WH (CONT)
• (1) Whether WH’s use of data indirectly sourced from
BHB’s database constituted extraction or re-utilisation
of a substantial part of the BHB database;
• (2) Whether WH’s actions amounted to a repeated and
systematic extraction or re-utilisation of insubstantial
parts of the database, such as to conflict with normal
exploitation of the database or unreasonably prejudice
the interests of the maker of the database.
54. BHB V WH (CONT)
• Finding no infringement of BHB’s right:
• (1) The right under the Database Directive protects investment in seeking
out existing independent materials and collecting them in a
database. It does not protect investment in the creation of data. Indirect
sourcing, as opposed to mere consultation, may constitute extraction and
re-utilisation. The material used by WH was quantitatively insubstantial,
and as it had not been the subject of investment independent of that
required for its creation, it was not qualitatively a substantial part.
• (2) WH’s repeated and systematic extraction and re-utilisation of
insubstantial parts did not reconstitute and/or make available to the public
the whole or a substantial part of the contents of the database, and so did
not conflict with normal exploitation of it or seriously prejudice BHB’s
investment.
55. RESULTS OF THE LITIGATION
• “By implication, the ECJ’s
decisions offer a partial solution
to one of the most obvious
deficiencies of the Directive,
the absence of a regime of
compulsory licensing to cure
the anticompetitive effects of
‘sole-source’ information
monopolies, such as those
exercised by BHB and
Fixtures.” Bernt Hugenholtz.
56. RESULTS OF THE LITIGATION
• “The decisions of the ECJ certainly restrict the operation of the
Directive, particularly with their emphasis on the distinction between
creation of data and obtaining it. In addition, the court’s findings on the
meaning of ‘substantial part of the contents of a database’ ensure a
reasonably close connection between the investment that is intended
under the Directive to be the subject of protection and what constitutes
infringement of the database right. While that connection does not
prevent the protection of small quantities of data that represent
substantial ‘qualitative’ investment, that is an almost inevitable
consequence of the wording and structure of the Directive with its
notions of ‘qualitative’ investment and ‘qualitatively’ substantial part,
and its reliance on a regime of exclusive rights of extraction and
reutilisation instead of less far-reaching unfair competition remedies.”
Hugenholtz
57. 2005 COMMISSION EVALUATION
58. @TECHNOLLAMA
All Your Data Are Belong To Us
Editor's Notes
A big warning about data ownership. People may not be willing to let you
Bake in the data into the design of a tool.
Blockchain GDPR.
Can AI be a controller? Processor?